ABSTRACT

We have developed an automated cross-correlation technique to detect 21cm emission 
in sample spectra obtained from the HI Parkes All Sky Survey. The initial sample se- 
lection was the nearest spectra to 2435 low surface brightness galaxies in the catalogue 
of Morshidi-Esslinger et al. (1999). The galaxies were originally selected to have prop- 
erties similar to Fornax cluster dE galaxies. As dE galaxies are generally gas poor it is 
not surprising that there were only 26 secure detections. All of the detected galaxies
have very high values of (M_HI/L_B)_⊙. Thus the HI selection of faint optical sources
leads to the detection of predominantly gas rich galaxies. The gas rich galaxies tend
to reside on the outskirts of the large scale structure delineated by optically selected 
galaxies, but they do appear to be associated with it. These objects appear to have 
similar relative dark matter content to optically selected galaxies. The HI column den- 
sities are lower than the 'critical density' necessary for sustainable star formation and 
they appear, relatively, rather isolated from companion galaxies. These two factors 
may explain their high relative gas content. We have considered the HI mass function 
by looking at the distribution of velocities of HI detections in random spectra on the 
sky. The inferred HI mass function is steep, though confirmation of this result awaits
a detailed study of the noise characteristics of the HI survey. 
1 INTRODUCTION

It is believed that galaxies evolve via star formation from
an initially gas dominated state to a finally stellar (and
stellar remnant) dominated state. Although the average star
formation rate of the Universe possibly had a peak value
somewhere between z = 1 and 2 (Madau et al. 1996; Lilley et al.
1996), individual galaxies can have very different star
formation histories. Star formation seems to have either started
at different times and/or proceeded at different rates at
different places in the Universe. Elliptical galaxies appear
to have formed very early and converted their gas into stars
very quickly, whilst galaxies like the giant Low Surface
Brightness (LSB) galaxy Malin 1 (Bothun et al. 1987; Impey
& Bothun 1989) still have huge reservoirs of gas and
seem to be forming stars at a (constant?) slow rate. The
globally averaged star formation rate of galaxies has been
determined using rather high surface brightness galaxies,
measuring either the ultra-violet and/or the far-infrared
luminosity density (Blain et al. 1999). What is not clear from
these measurements is whether there is a significant population
of galaxies, similar to Malin 1, that have continued to
form stars at very much lower rates over longer periods of
time. These galaxies would be very difficult to detect in the
ultra-violet, because of their low surface brightness, and in
the far-infrared, because their low gas phase metallicity (hence
low dust content) and the large average distance between
the stars would leave the dust very cold. Given their
hypothesised large relative gas mass the most fruitful region of
the spectrum in which to detect them would seem to be 21cm.

Malin 1 is different to other spiral galaxies in a number
of ways. For example its (M_HI/L_B)_⊙ of 5 (Bothun et al.
1987; Impey & Bothun 1989) is much higher than the 0.1 for
a 'typical' spiral galaxy and very much more than 0.01 for a
'typical' elliptical. While there does seem to be a systematic
increase in (M_HI/L_B)_⊙ from ~ 0.05 for early type spirals
to ≈ 1 for very late type irregulars (Knapp 1990), galaxies
with values as high as Malin 1 are quite extraordinary. A
large value of (M_HI/L_B)_⊙ indicates either a galaxy that is
very young or one that has been forming stars at a very low
rate, for its mass, compared to other galaxies. If we can find
larger numbers of galaxies with these properties then we will
gain a much better understanding of the factors that govern
galaxy and star formation rates.

The reasons why galaxies form stars at different rates are
not totally clear, but there are two very likely prime factors.
These are the initial conditions (the gas density at formation)
and the frequency of galactic interactions. High density
and/or a large number of encounters are both thought to
promote star formation. Thus elliptical galaxies formed at a 
place of high initial overdensity and subsequently had many
interactions and mergers with smaller galaxies, while Malin 
1 probably formed at a spatially large, but small overdensity 
in a very isolated environment. By studying relatively iso- 
lated gas rich galaxies we have the opportunity of studying 
galaxies that have not had rapid star formation induced by
interactions, have not undergone mergers and have not been tidally
stripped. As long as they have not suffered significant expul- 
sion or accretion of gas the mass function of these galaxies 
should reflect the initial mass function of galaxies at their 
time of formation. Gas rich galaxies are thus the best galaxies
to compare with the initial density fluctuations (Press
& Schechter 1974) assumed in recent numerical models of
galaxy formation (Frenk et al. 1996; Kauffman et al. 1997).

There are two fundamental ways of detecting atomic
hydrogen in gas rich systems, either by absorption or emission.
QSO absorption line studies indicate large numbers of
gas rich systems (from the damped Lyα systems to the Lyα
forest) most of which have no identifiable optical counterparts.
21cm observations have predominantly concentrated
on, and therefore almost always appear to be associated
with, optical systems. In addition, the QSO data tend to
sample the distant Universe while the 21cm observations
have concentrated on rather nearby objects. QSO absorption
line observations are generally sensitive to much lower
column densities (down to ≈ 10^12 atoms cm^-2) than 21cm
observations (at best ≈ 10^18 atoms cm^-2), but even where
the two regimes overlap there are many objects that have
no optical counterparts. Some damped Lyα systems, for example,
which were generally believed to arise in the discs
of 'typical' large spiral galaxies have now, after much closer
scrutiny, been found to arise from absorption in dwarf and
LSB galaxies (Cohen 200$; Bowen et al. 2000). If further
observations confirm this for other damped Lyα systems then
we will have to move away from the view that the majority of
hydrogen absorption lines occur in huge gas haloes around
'typical' galaxies to one in which the gas is clumped into
much smaller, previously undetected clouds. So we might
speculate on the possibility of HI rich clouds like this existing
nearby and thus being accessible to 21cm observation at
column densities of ~ 10^18 atoms cm^-2 or above (for an
alternate view see Rao & Briggs, 1993). As 21cm observations
become more sensitive and extensive we will be able to test
this hypothesis. A start can be made using the first 21cm
all sky survey (Barnes et al. 2001).

So is there a large local population of galaxies that have
converted only a small fraction of their gas into stars? If so,
what is their spatial distribution? What is the form of their
HI mass function and how does such a population relate to
current numerical models of galaxy formation? To try and
answer these questions we have used 21cm data taken from
the HI Parkes All Sky Survey (HIPASS)* to study the HI
properties of a sample of LSB galaxies. The extracted HI
spectra are those closest to the optical positions of the 2435
galaxies in the LSB galaxy sample of Morshidi-Esslinger
et al. (1999a). Where required we have used H_0 = 75 km s^-1
Mpc^-1.



2 THE DATA 

The optical data are taken from the photographic survey for 
LSB galaxies carried out by Morshidi-Esslinger et al. (1999a 
and b). The survey covered approximately 2000 sq deg us- 
ing data obtained from APM scans of UK Schmidt telescope 
survey plates. Galaxies were selected to be 'similar' to pre- 
viously detected dE galaxies in the Fornax cluster. We use 
the word 'similar' because the APM automated detection 
routine is optimised to select rather smooth looking images 
like dE galaxies, rather than dI or spiral galaxies. The latter
tend to have a 'lumpy' appearance which the APM classifier
often splits into separate or 'noise' images. The photometric
selection criteria were a central surface brightness in the B
band fainter than 22.5 Bμ and an exponential scale length
greater than 3 arc sec. Full details of the optical survey data
are given in (Morshidi-Esslinger et al. 1999a).

The optical data were originally used to study the total 
numbers, numbers in different environments and the clus- 
tering scales of LSB dwarf galaxies. Given that we tried 
to optimise the galaxy selection to dE galaxies we actually 
might not expect any HI detections. Previous observations 
of dE galaxies in clusters indicate HI masses of less than 10^7
M_⊙ (Impey et al. 1988) while our sensitivity (see below) is
only below 10 Mq for velocities less than about 1500 km 
s^-1. Thus we do not expect to be able to detect the
numerically dominant dE galaxies in this sample. Rather, we
are trying to detect 'interlopers', that is objects that appear 
similar to LSB dE galaxies yet contain larger amounts of 
HI. In fact Malin 1 was discovered in a similar way. It was 
originally thought to be a dwarf galaxy in the Virgo cluster, 
but was later discovered to be a giant LSB galaxy in the
background (Bothun et al. 1987). Our models and observations
(Morshidi-Esslinger et al. 1999a; Morshidi-Esslinger
et al. 1999b) indicated some 'background' contamination of
the optical sample by more distant objects, not necessarily 
dE galaxies. We were hoping that some of these might be 
gas rich galaxies like Malin 1. 

The HI data come from the 388 8° x 8° survey data
cubes of the HIPASS 21cm southern sky survey (Barnes et al.
2001). The angular resolution of the data is 15.5 arc min
after the data have been gridded. The grid spacing in each
cube is 4 arc min and the spectrum used is the nearest spectrum
to the optical position. The channel spacing is 13.2 km
s^-1 and the velocity resolution is 27 km s^-1 after smoothing.
There are 1024 channels, but we initially considered only
velocities in the range 400-12000 km s^-1. The lower limit was
set to avoid local hydrogen, the upper limit by the proximity
of the velocity cut-off. The typical noise fluctuation
in each spectrum after smoothing is 0.006 Jy beam^-1. The
identification of sources in the HI data is described below. 



* The Parkes telescope is part of the Australia Telescope which 
is funded by the Commonwealth of Australia for operation as a 
National Facility managed by CSIRO 
3 HI DETECTION



The automated detection, to well defined selection criteria,
of images on, for example, CCD frames has become much
more sophisticated over the last few years (see for example
Bertin & Arnouts 1996). Numerous computer packages exist
to automatically select galaxies to well defined selection
criteria (isophotal size and magnitude for example). This 
does not appear to be the case for HI detections yet the two 
problems are very similar. For example Kilborn (2001) dis- 
cusses an automated galaxy finder for use on HiPASS data 
cubes, but then resorts to selection by eye. In none of the
papers on blind HI surveys that we have come across do we find
an objective selection criterion for HI sources (Zwaan et al.
1997; Schneider et al. 1998). These papers supply information
about the observing setup and the data reduction, but
say little about the detection of objects from the spectra 
obtained. In the main objects appear to be identified by eye 
and there is no explanation of the selection criteria except 
to say (incorrectly) that there is some lower mass limit at 
each distance. 

We have previously been involved with techniques for
the detection of LSB galaxies in imaging data (Phillipps &
Davies 1993; Davies et al. 1994; Kambas et al. 2000). A number
of years ago it had become clear that LSB galaxies were
very much underrepresented in optically (by eye) selected
samples taken from imaging data. The lesson learnt was that
only when a full analysis of the selection process had been
carried out could you then define the sorts of galaxies you
would and would not be able to detect (Disney 1976; Disney
& Phillipps 1983; Davies 1990). Carrying out deeper
observations with better understood selection criteria has 
led to the detection of numerous LSB galaxies. 

An optical image of a galaxy is detected against a sys- 
tematically varying background level with the addition of 
random noise fluctuations. Detection of the HI signal is very 
similar - the varying background is the base-line ripple and 
in addition there are random noise fluctuations. For optical 
image detection there is no well defined magnitude or size 
limit - sample selection is always a combination of magni- 
tude and size. For example one can always think of a galaxy 
that is bright enough to be part of a magnitude limited 
sample, but fails to get in because it is too large (its surface
brightness is less than or close to the survey isophotal limit).
In a similar way large velocity width galaxies with low central
intensities will be missed or assigned to baseline ripples
even though they contain sufficient hydrogen, in total, to be 
detected in a 'mass limited' survey. In this section we de- 
scribe how we have applied some of our previous techniques 
of surface photometry to the detection of HI sources. 

Having 2435 spectra to inspect was another strong mo- 
tivation for employing an automated technique. As men- 
tioned above there are two important factors that influence 
our ability to detect 'HI objects' in HI spectra. The first is
random noise, the second is baseline fluctuations. The signal
is the integral over the line width, so large signals can
arise from large peak values and/or large velocity widths.
The problem with identifying large velocity widths without
large peak values is that they can look the same as baseline
fluctuations (see also section 6 and figure 11). The problem
with the random noise is that the (Gaussian) expectation
is one single channel 3σ fluctuation (σ is the standard
deviation of the data values) in the 1000+ channels. Thus 3σ
detections (see figure 1) are not reliable unless they have
sufficiently large velocity widths, but even then, if the velocity
width is too large, they can resemble baseline fluctuations.
By 'hiding' simulated galaxies in real spectra it became clear
that an initial 4σ detection was required, because even 3σ
peak values with quite large velocity widths were not
convincingly different to the noise. So the initial object
identifier was simply one of peak value at the 4σ level. At 4σ
we would expect one false detection in every 30 spectra or
about 80 false detections in the sample as a whole. So the
second requirement was that the initial peak value detection
also had a 'resolved' velocity width (figure 2), that is a
velocity width greater than 27 km s^-1. With these criteria we
would not expect any detections by chance (see also section
6 below). With these criteria the lowest signal to noise ratio
for detection is about 10, a value that would be readily
accepted in imaging data.

Figure 1. A 3σ fluctuation in an HI spectrum at ≈ 6700 km s^-1.
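
The false detection rates quoted above can be checked with a short calculation. The sketch below is illustrative only: it assumes one-sided Gaussian noise that is independent from channel to channel (smoothing will in practice correlate neighbouring channels), and it uses the 1024 channels and 2435 spectra quoted in the text. The function names are hypothetical.

```python
import math

N_CHANNELS = 1024   # channels per spectrum (section 2)
N_SPECTRA = 2435    # spectra in the sample

def tail_probability(n_sigma):
    """One-sided probability that a Gaussian value exceeds n_sigma."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))

def expected_peaks_per_spectrum(n_sigma, n_channels=N_CHANNELS):
    """Expected number of single-channel noise peaks above n_sigma in one spectrum."""
    return n_channels * tail_probability(n_sigma)

per_spectrum_3 = expected_peaks_per_spectrum(3.0)
per_spectrum_4 = expected_peaks_per_spectrum(4.0)
false_in_sample = N_SPECTRA * per_spectrum_4

print(f"3 sigma: {per_spectrum_3:.2f} noise peaks per spectrum")
print(f"4 sigma: one noise peak per {1.0 / per_spectrum_4:.0f} spectra")
print(f"4 sigma: {false_in_sample:.0f} noise peaks in the whole sample")
```

Under these assumptions the numbers come out close to the 'one in every 30 spectra' and 'about 80' figures given in the text, which is why a peak value alone is not enough and a resolved velocity width is also required.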

Our selection criteria do not lead to an integrated flux
limited sample (see above), so we will use the term 'survey 
limits' to indicate our two minimal selection criteria. This 
is analogous to what would be referred to (incorrectly) as 
the magnitude limit for a magnitude limited imaging survey 
sample. There are two other points. Firstly our selection cri- 
teria will lead to the preferential selection of face-on, rather 
than edge-on, disc galaxies as these will have higher cen- 
tral intensities and narrower line profiles. Secondly 'spikes' 
in the data like that illustrated in figure ^ are very similar 
in velocity width but, lower in amplitude, to the confirmed 
detection of an apparently isolated HI cloud by Kilborn et 
al. (2000). 

Given the noise in the data it is difficult to measure
the velocity width and flux integral accurately. To minimise
this problem we have cross-correlated the data with templates
and used the best fitting template to derive the central
velocity, velocity width and flux integral. Inspection of
a small part of the data indicated that by far the majority of
the sources appeared as single 'spikes' rather than 'double
horned'. The exact form of the template used is not critical,
but the maximum gain in signal to noise is obtained for an
'optimum filter', that is one that has the same shape as the
object being measured; this is an example of matched filtering
(Irwin et al. 1990). We have chosen Gaussian templates
as they appear 'similar' to the profile shapes we are trying
to measure and they are relatively easy to interpret. The
cross-correlation program will also measure 'double horned'
profiles (but not so accurately) by fitting Gaussians around
the central velocity.

Figure 2. A 4σ detection with a velocity width that is just
resolved (≈ 38 km s^-1).

Essentially we use a technique similar to one we have
used before to detect and measure LSB galaxies (Davies
et al.; Phillipps & Davies 1993). We have used this
method to derive the best fitting exponential central surface
brightness and scale length of LSB galaxy images (photometry).
Here we will derive the central intensity and velocity
width of the best fitting Gaussian to the HI spectrum. The
cross-correlation technique for surface photometry is fully
described in (Phillipps & Davies 1992) (PD). Below we will
briefly describe our method using similar notation to PD.

The correlation coefficient of a spectrum G_s and a
model template spectrum G_t is defined in general by the
convolution

\[ C_{st}(r) = \frac{\int G_s(x)\,G_t(x+r)\,dx}{\left(\int G_s^2(x)\,dx\right)^{1/2}\left(\int G_t^2(x)\,dx\right)^{1/2}} \qquad (1) \]

where r is the shift between the spectrum and template
and the integrals are taken over their intersection. In practice
the integrals are taken as sums over the digitised data.
C_st ≤ 1 regardless of the form of G_t or G_s and there is a
maximum when G_t = aG_s, where a is a positive constant
(see PD). If the data substantially exceeds the scale size of
the Gaussians used we can ignore the limits on the integrals
of equation 1. So, if the velocity profiles of the template and
galaxy are both Gaussians then G_t(v) = I_t exp -(v/α_t)^2 and
G_s(v) = I_s exp -(v/α_s)^2, where v and α are velocities and
I is an intensity. Substituting into 1 and integrating gives



\[ C_{st} = \frac{2\alpha_s\alpha_t}{\alpha_s^2+\alpha_t^2} \qquad (2) \]

which obviously has a maximum when α_s = α_t. By
convolving with different Gaussian templates over a range
of velocities we can determine the best matching template
and the central velocity from the maximum value of C_st. To
determine the flux integral we also need to know the central
intensity. To do this we use a second convolution, but with
a different normalisation



\[ A_{st}(x) = \frac{\int G_s(x)\,G_t(x)\,dx}{\int G_t^2(x)\,dx} \qquad (3) \]

For the Gaussian case this reduces to

\[ A_{st} = \frac{I_s}{I_t}\,\frac{2\alpha_s^2}{\alpha_s^2+\alpha_t^2} \qquad (4) \]

which implies A_st = I_s/I_t when α_s = α_t. So once we have
selected the correct template (α_t), A_st is just the ratio of
the unknown intensity I_s to the known normalization of the
model, I_t.

The velocity profile will not be a perfect Gaussian so 
the cross-correlation will find a closely matching Gaussian 
model. This will be the 'best' fitting model in the sense of 
minimizing the weighted sum of the deviations. 
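
The template matching described above can be sketched in a few lines. This is a minimal illustration rather than the program used for the survey: it assumes the central velocity is already known (no scan over shifts), uses the whole spectrum as the intersection, and works on a hypothetical noise-free profile. All function names are invented for the example, and it relies only on the correlation peaking when template and profile widths agree, not on the closed forms of equations 2 and 4.

```python
import math

def gaussian(v, centre, fwhm, peak=1.0):
    """Gaussian profile peak * exp(-((v - centre) / alpha)**2) with the given FWHM."""
    alpha = fwhm / (2.0 * math.sqrt(math.log(2.0)))
    return peak * math.exp(-((v - centre) / alpha) ** 2)

def correlation(spectrum, template):
    """Discrete form of equation 1 at zero shift (sums replace integrals)."""
    cross = sum(s * t for s, t in zip(spectrum, template))
    norm = math.sqrt(sum(s * s for s in spectrum) * sum(t * t for t in template))
    return cross / norm

def amplitude(spectrum, template):
    """Discrete form of equation 3: cross term over the template's sum of squares."""
    cross = sum(s * t for s, t in zip(spectrum, template))
    return cross / sum(t * t for t in template)

def best_template(velocities, spectrum, centre, fwhms):
    """Return (correlation, fwhm, amplitude) for the best correlated template."""
    best = None
    for fwhm in fwhms:
        template = [gaussian(v, centre, fwhm) for v in velocities]
        c = correlation(spectrum, template)
        if best is None or c > best[0]:
            best = (c, fwhm, amplitude(spectrum, template))
    return best

# Hypothetical test profile: FWHM 100 km/s, peak 0.05 Jy, centred at 1500 km/s.
velocities = [400.0 + 13.2 * i for i in range(400)]   # 13.2 km/s channels
spectrum = [gaussian(v, 1500.0, 100.0, peak=0.05) for v in velocities]
fwhms = [25.0 + 12.5 * k for k in range(39)]          # 25 to 500 km/s, 12.5 km/s steps

c_max, fwhm_fit, peak_fit = best_template(velocities, spectrum, 1500.0, fwhms)
print(c_max, fwhm_fit, peak_fit)
```

Because every template has unit peak intensity, the amplitude returned for the best matching template is directly the estimate of the central intensity of the profile.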

In practice we will always have a noisy image. For example,
if the noise per pixel is everywhere Gaussian with
a fixed amplitude σ (this assumes the signal is not large
compared to the noise) then

\[ \int G_s^2(v)\,dv = \int \left(I_s \exp-(v/\alpha_s)^2 + N\right)^2 dv \qquad (5) \]

while the other terms remain unchanged. N represents
a Gaussian random error term with mean zero and standard
deviation σ. As the cross-terms have an expectation value
of zero equation 2 becomes



Cat — 



2a s a t 



(6) 



where x_t is the intersection of the spectrum with the
template. The 'correction' term involves only the parameters
of the noise and the spectrum and so the maximum
still occurs at α_s = α_t. As one might expect the effects of
the noise are minimised for large values of α_s I_s^2 (large
velocity widths and peak values). During the cross-correlation
process we kept the intersection, x_t, the same for each template
so that we could compare the correlation coefficients
of different templates. 

We have used Gaussian templates with full width half
maximum values from 25 to 500 km s^-1 at 12.5 km s^-1
intervals. We reject those that are not velocity resolved
(v < 27 km s^-1). The largest detected velocity width in
the sample is 337.5 km s^-1, some way below the largest
template size. The detection process has been fully tested
on a wide range of simulated and real data. Using simulated
Gaussian profiles, in real data, with central intensities of
4σ we find that we can estimate profile parameters and HI
masses to about a factor of 3. As confirmation of this one
of the galaxies in the final sample (see below) has a previous
21cm measurement. For F300-026 Matthewson and Ford
(1996) measure an HI mass of 8 x 10^8 M_⊙ (using our derived
distance of 11.3 Mpc) while the value derived from
our cross-correlation program is 6 x 10^8 M_⊙. We have also
excluded regions in each spectrum that contain known sources
of noise (HIPASS web page).

Table 1. A comparison of line of sight velocities obtained from
NED and measured HI velocities.

Name        NED velocity    HI velocity
            (km s^-1)       (km s^-1)
F115-001    1131            1305
F303-023    4485            4412
F304-013    2250            2098
F353-003    3739            3922
F362-027    1344            1332
F410-001    1545            1558
F418-059    1673            1758
F481-018    2087            2079
F483-019    4128            4017
F548-020    1961            1958

In summary the automated HI detection process involved
firstly the identification of a 4σ or higher value, then
finding the maximum correlation coefficient with a template
of velocity width (full width at half-maximum) greater than
27 km s^-1, the velocity resolution of the data (in practice
the width of the smallest 'resolved' template of 37.5 km
s^-1). This is what we define as our 'sample limits'.

After carrying out the template matching we had a list 
of 155 HI detections. We then needed to carry out other 
checks to see how secure these detections were. The main 
problem is one of reliably assigning the optical and HI detections
to the same object. Typically the galaxies in the optical
sample have diameters of 0.3 arc min. The HI resolution is 
15.5 arc min. To overcome this problem we have used the 
optical Digital Sky Survey (DSS) to inspect the area around 
each HI detection. We have also used the NASA/IPAC Extragalactic
Database (NED) to find known objects within 10
arc min of the position of the HI spectra. We have removed
objects from our initial list if NED has a similar redshift
for another object within 10 arc min or if there is a more
prominent galaxy within the field of view. For strong nearby
signals (v < 2000 km s^-1) we searched NED for galaxies of
similar redshift at up to 1 deg away: by looking at spectra at
random positions around nearby galaxies (such as NGC1365
and NGC1291) it is clear that they can affect spectra up to
30 arc min away from their optical centre. As an indication
that some of our optical and HI detections come from
the same object we can compare the measured HI veloci- 
ties with previously determined (optical) velocities obtained 
from NED, these are available for 10 galaxies in our sample 
(see table 1). In all cases the optical and HI velocities are in 
good agreement. 
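
The level of agreement in Table 1 can be summarised numerically. The short sketch below simply copies the ten velocity pairs from the table and computes the mean, root mean square and largest absolute difference; the choice of these summary statistics is ours and is not part of the original analysis.

```python
# (name, NED velocity, HI velocity), both in km/s, copied from Table 1.
TABLE_1 = [
    ("F115-001", 1131, 1305),
    ("F303-023", 4485, 4412),
    ("F304-013", 2250, 2098),
    ("F353-003", 3739, 3922),
    ("F362-027", 1344, 1332),
    ("F410-001", 1545, 1558),
    ("F418-059", 1673, 1758),
    ("F481-018", 2087, 2079),
    ("F483-019", 4128, 4017),
    ("F548-020", 1961, 1958),
]

# HI minus NED velocity for each galaxy.
differences = [hi - ned for _, ned, hi in TABLE_1]

mean_difference = sum(differences) / len(differences)
rms_difference = (sum(d * d for d in differences) / len(differences)) ** 0.5
largest_difference = max(abs(d) for d in differences)

print(f"mean = {mean_difference:.1f} km/s")
print(f"rms = {rms_difference:.1f} km/s")
print(f"largest = {largest_difference} km/s")
```

The mean offset is small compared with the scatter, so there is no sign of a systematic shift between the two sets of velocities.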

The above procedure resulted in a reduction to 84 detec- 
tions, but it was clear from inspection of the images from the 
DSS that for the most distant objects (greater than about
6000 km/s) confusion was still a problem. The large beam
size covered many faint objects not listed in NED, but dis- 
tinctly possible sources of the HI emission. In fact the HI 
emission could arise from the combination of a number of 
sources in the same group. A simulation (see below) of the 
expected number of sources in a set of random beams in- 
dicated that contamination of the sample was possible at a 
level better than about 1 in 4 for a sample limited to 5000 
km/s, but that this drops quickly to about 1 in 2 or worse 
beyond 6000 km/s. Given that our data is not from a set of 
random sight lines we should expect to do better than this 
and so (rather arbitrarily) we set a maximum velocity limit 
of 5500 km/s. This fits in well with the previously measured 
galaxies of table 1, which all have confirmed redshifts below 
this limit. 

This final sample consists of just 26 objects, out of an 
initial sample of 2400, that have both a reasonably secure 
optical and HI detection. Given the above discussion and 
that our sample consists of relatively isolated galaxies (see 
below) we believe that our detections are secure and not 
due to other nearby objects. In figure 3 we have plotted
the HI mass against the absolute B band magnitude. As 
one might have hoped there is a correspondence between 
the two, supporting our contention that the optical and HI 
detections belong to the same objects. 

At our sample limits we would expect to be able to
detect ≈ 2 x 10^7 M_⊙ of hydrogen at our minimum velocity of
400 km s^-1 and ≈ 3 x 10^9 M_⊙ of hydrogen at our maximum
velocity of 5500 km s^-1. Given that a 'typical' M* galaxy
(like the Milky Way) has M_HI ≈ 10^10 M_⊙ (Zwaan et al.
1997) we can detect galaxies that are gas poor compared to
M* over our full range of velocities.

Although the low number of combined optical and HI 
detections is disappointing, it is what we might have ex- 
pected if our original, optical, selection was sound. The op- 
tical selection was designed to select gas poor dE galaxies 
and this is what it appears to have predominately done. The 
only other explanation for the low number of HI detections 
would be that most of the undetected galaxies are at large 
distances (v > 12000 km s^-1), but this is unlikely given the
way the optically detected galaxies appear to cluster around
nearby brighter galaxies (Morshidi-Esslinger et al. 1999a). 



4 THE SPATIAL DISTRIBUTION OF THE 
GAS RICH GALAXIES 

In figure 4 we have plotted the positions of the detected
objects compared to the positions of the four major
groups/clusters in the survey area. The spatial distribution 
of our detections is very different to that of the complete op- 
tical sample (see fig. 8 in Morshidi-Esslinger et al. 1999a). 
The optical galaxies cluster about the group/cluster centres 
while the HI detections appear to avoid the cluster centres 
almost completely. This segregation of gas rich galaxies from 
the gas poor has of course been known for some time. It is 
in low galactic density environments that we would expect
to find galaxies like this (Solanes et al. 2000). It is possible
that the effect has been enhanced in this case by increased 
'optical confusion' due to the higher density of galaxies in 
the group/cluster centres. 

Figure 3. The HI mass against the absolute B band magnitude.

In figure 5 we have plotted a histogram of the line of
sight velocities. The detections cluster at about the velocities
expected for the bright galaxies. Fornax, Dorado, NGC1400
and Sculptor groups/clusters all have redshifts below 2000
km s^-1. Both Jones and Jones (1980) and Fairall (1998)
show that over this region of sky there is a peak in the number
density of galaxies at about 1500 km s^-1 and then a
void out to about 4000 km s^-1. This shows that although
these galaxies avoid group/cluster centres they are still asso- 
ciated with the larger scale structure defined by the brighter 
galaxies. 



5 MASS TO LIGHT RATIOS 

HI masses can be derived using

\[ M_{HI} = 2.4 \times 10^5\, d^2 \int S_v\, dv \qquad (7) \]

where M_HI is the mass of HI in solar units, d is the
distance to the galaxy in Mpc, S_v is the flux density and
the integral is over velocity. The flux integral is solved using
the Gaussian parameters of the best fitting template as
described above. Distances are obtained by converting line
of sight velocities to velocities relative to the Local Group
(Yahil et al. 1977).
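
As an illustration of how equation 7 is applied to a fitted Gaussian template, the sketch below computes the flux integral analytically from the peak and the full width at half maximum and then converts it to a mass. It assumes the flux density is in Jy and the velocity in km s^-1 (the usual units for a coefficient of this size), a pure Hubble flow distance with H_0 = 75 km s^-1 Mpc^-1, and a hypothetical source; the function names are invented for the example.

```python
import math

H0 = 75.0  # km/s/Mpc, the value adopted in the introduction

def gaussian_flux_integral(peak_jy, fwhm_kms):
    """Area under a Gaussian line profile, in Jy km/s."""
    return peak_jy * fwhm_kms * math.sqrt(math.pi / (4.0 * math.log(2.0)))

def hubble_distance(velocity_kms, h0=H0):
    """Distance in Mpc from a Local Group corrected velocity (pure Hubble flow assumed)."""
    return velocity_kms / h0

def hi_mass(distance_mpc, flux_integral_jy_kms):
    """Equation 7: HI mass in solar masses."""
    return 2.4e5 * distance_mpc ** 2 * flux_integral_jy_kms

# Hypothetical source: 0.05 Jy peak, 100 km/s FWHM, 1500 km/s recession velocity.
flux = gaussian_flux_integral(0.05, 100.0)
mass = hi_mass(hubble_distance(1500.0), flux)
print(f"flux integral = {flux:.2f} Jy km/s, M_HI = {mass:.2e} solar masses")
```

The quadratic dependence on distance is the reason the minimum detectable mass rises so steeply between the lowest and highest velocities in the sample.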

We have calculated stellar masses from the absolute
blue magnitude of the galaxy assuming a stellar mass to
light ratio of one (as suggested by Stavely-Smith et al. (1990)
and de Blok et al. (1996) for gas rich galaxies) and an absolute
blue magnitude for the Sun of M_⊙ = 5.4 (Banks et al.
1999). We found that all of the photographic magnitudes
listed in (Morshidi-Esslinger et al. 1999a) were significantly
fainter than the available CCD magnitudes listed in NED.
We have used the NED CCD photometry wherever possible
and made the photographic magnitudes brighter by the
mean of the CCD correction where this was not possible.
The mean correction was 0.2 magnitudes.

Figure 4. The position of the HI rich galaxies in relation to
nearby groups and clusters. Triangles mark galaxies with velocities
less than 2250 km s^-1, squares those with velocities greater
than 2250 km s^-1.
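
The conversion from absolute blue magnitude to luminosity, and hence to the gas mass to light ratio, can be sketched as follows. The solar magnitude of 5.4 and the 0.2 magnitude photographic correction are taken from the text; the galaxy values and the function names are hypothetical. With a stellar mass to light ratio of one, the blue luminosity in solar units is also the adopted stellar mass.

```python
M_SUN_B = 5.4  # absolute blue magnitude of the Sun adopted in the text

def corrected_photographic(mag, correction=0.2):
    """Brighten a photographic magnitude by the mean CCD correction."""
    return mag - correction

def blue_luminosity(abs_mag_b, m_sun=M_SUN_B):
    """Blue luminosity in solar units from an absolute B magnitude."""
    return 10.0 ** (0.4 * (m_sun - abs_mag_b))

def gas_to_light(hi_mass_solar, abs_mag_b):
    """(M_HI/L_B) in solar units."""
    return hi_mass_solar / blue_luminosity(abs_mag_b)

# Hypothetical galaxy: photographic M_B = -14.4, M_HI = 3e8 solar masses.
abs_mag = corrected_photographic(-14.4)
ratio = gas_to_light(3.0e8, abs_mag)
print(f"M_B = {abs_mag:.1f}, L_B = {blue_luminosity(abs_mag):.2e}, M_HI/L_B = {ratio:.2f}")
```

Brightening a magnitude by 0.2 raises the luminosity by about 20 per cent, so the correction lowers the derived mass to light ratios by the same factor.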

In figure 6 we show the distribution of (M_HI/L_B)_⊙ for
our sample. It is quite clear that this sample consists of
galaxies with extraordinary values of (M_HI/L_B)_⊙ (all of the
8 galaxies with (M_HI/L_B)_⊙ < 1 have (M_HI/L_B)_⊙ > 0.3).
The relative gas mass of these galaxies is far higher than
that of a 'typical' spiral galaxy (Knapp 1990).

In figure 7 we show (M_HI/L_B)_⊙ plotted against M_B.
Although there is some scatter a clear trend exists for the
fainter galaxies to have larger values of (M_HI/L_B)_⊙. A least
squares fit gives (Mhi / Lb)q oc £,~ oa+ /~ oa an exponent 
very close to the value of -0.3 +/- 0.1 found by Stavely-Smith
et al. (1992) for a sample of HI rich dwarf galaxies, but the
(Stavely-Smith et al. 1992) sample was optically selected and
has much lower values of (M_HI/L_B)_⊙ than this sample.

In the same way that selection at any wavelength pre- 
dominantly selects objects that are bright at that wave- 
length the combination of relatively faint optical sources 
with HI selection has led to a sample with large values of 
(M_HI/L_B)_⊙. Thus, although small, we have constructed a
sample of galaxies that apparently have turned only a small 
fraction of their gas into stars. 



6 DYNAMICAL MASSES 

We can make an estimate of the dark matter content of these
galaxies by comparing the calculated dynamical masses with
the total HI and stellar mass. The main problem with this
calculation is knowing the 'dynamical state' of the system,
that is whether the system is pressure or rotationally supported.
Some of the objects appear to be flattened, some are
round, some have double-horned velocity profiles, but most
have what appears to be a Gaussian shape. For these reasons
we have used the 'indicative dynamical mass' estimator of
Roberts (1978) to approximate the dynamical masses. We
have measured optical sizes and inclinations (from the
semi-minor to semi-major axis ratio) from the DSS images and
then used

where θ (arc sec) is the angular size, v (km/s) the line
of sight velocity, Δv (km/s) the velocity width and i the
inclination (equation adapted from Banks et al. 1999).

In figure 8 we show the dynamical mass ratio as a function
of absolute magnitude. The squares are the calculated
values taking into account the sin(i) factor for the velocity
widths (a rotating system) while the triangles are with
sin(i) = 1, which is essentially the pressure supported case.
For some cases the sin(i) correction is not large and typically
the dynamical masses are of order 10 times the mass
in gas and stars. This is about the same value that more
typical, lower relative gas mass, galaxies have. For a few
galaxies the sin(i) correction is large and probably not
appropriate. For example the two faintest objects appear as
small spheroidal systems and they have Gaussian velocity
profiles. The sin(i) correction changes their measured
dynamical masses by about a factor of 30. We conclude that
isolated galaxies which have only converted a small fraction
of their gas mass into stars are dominated by dark matter
in a similar way to other galaxies.

Figure 5. The distribution of galaxy velocities.

Figure 6. The distribution of M_HI/L_B for the HI/optical sample.

Figure 7. M_HI/L_B against absolute blue magnitude for the
HI/optical sample.



7 HI COLUMN DENSITIES 

So why have these galaxies only converted a small fraction
of their HI gas into stars? Given their apparently rather
isolated positions compared to the optically selected sample, an
obvious solution is that it is interactions with other galaxies
that promote star formation and rapidly consume gas.
Galaxies can enhance their star formation rates if encounters
compress the gas in a similar and additional way to
the spiral density wave which promotes star formation in
spiral galaxies. Our HI sample galaxies appear to be rather
isolated and so they must rely on their own internal dynamics
to produce the instabilities in the gas that lead to star
formation.

Figure 8. The indicative dynamical mass to total mass
(HI + stars) against the B band absolute magnitude. Squares
indicate inclination corrected velocities, triangles no
inclination correction (pressure supported).

Toomre (1964) has proposed an instability criterion for
uniformly rotating gas discs (note that not all of the sample
galaxies may be discs). This gives a critical column density
below which star formation is suppressed. We can calculate
the Toomre critical column density ($\Sigma_{crit}$, using an
expression adapted from Kilborn et al. 2000) and compare it with
the observed mean HI column density of our sample galaxies
(figure 9), assuming that they are uniformly rotating discs.
To derive the mean column density we have used the calculated
HI mass and the measured optical size. In all cases the
observed HI column density is far less than the calculated
critical density. Thus these galaxies seem to be deprived of
both of the prime ingredients of star formation. They have
relatively low mean HI column densities and they are isolated
from their companions.
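
As a rough illustration of the mean column density estimate described above, the sketch below spreads an HI mass uniformly over a circular disc whose angular diameter is the measured optical size. The function names, the treatment of the optical size as a diameter, the distance input in Mpc and the example numbers are all assumptions made for illustration; they are not taken from the text, and the critical density is left as an input rather than computed.

```python
import math

# 1 solar mass of atomic hydrogen per square parsec, expressed in atoms cm^-2
MSUN_PER_PC2_TO_ATOMS_CM2 = 1.25e20
ARCSEC_TO_RAD = math.pi / (180.0 * 3600.0)


def mean_hi_column_density(m_hi_msun, theta_arcsec, distance_mpc):
    """Mean HI column density (atoms cm^-2) of a uniform circular disc.

    theta_arcsec is treated as the angular *diameter* (an assumption).
    """
    radius_pc = 0.5 * theta_arcsec * ARCSEC_TO_RAD * distance_mpc * 1.0e6
    area_pc2 = math.pi * radius_pc ** 2
    return m_hi_msun / area_pc2 * MSUN_PER_PC2_TO_ATOMS_CM2


def column_density_ratio(sigma_obs, sigma_crit):
    """Ratio of observed to critical column density (same units for both)."""
    return sigma_obs / sigma_crit


# Hypothetical example: 1e8 M_sun of HI, 60 arcsec across, at 20 Mpc
example = mean_hi_column_density(1.0e8, 60.0, 20.0)
```

A ratio below one from `column_density_ratio` corresponds to the sub-critical case discussed in the text.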

Given the huge spaces between the galaxies one can 
speculate on the numbers of HI clouds that suffer a similar, 
but more extreme, fate than the objects in our sample. 



8 THE HI MASS FUNCTION 

There are far too few galaxies in the optical/HI sample to
derive an HI mass function in the normal way, but we can infer
the relative numbers of low and high HI mass galaxies in
another way. One of the great advantages of HI observations,
compared to imaging, is that we not only obtain HI masses
but also velocities (distances) for the objects. The
distribution of distances of detected objects can be used to make
an estimate of the slope of the HI mass function. If there
are large numbers of massive galaxies compared to low mass
ones then there will be a relatively large number of detections
at large distances. Alternatively, if most of the systems are of
low mass then there will be a relatively large number of
detections nearby. There are a number of advantages to this
method. Firstly we do not have to measure HI masses, we
only require detections. We can use our automated method
to detect objects above our well defined survey limits; we do
not have to rely on detection by eye and a subsequent mass
measurement. There are no difficult volume corrections to
be made to each mass interval. We can use just the relative
numbers at each distance to infer the relative importance
of high and low HI mass galaxies. On the down side, if
we want to derive a more quantitative value for the low mass
slope of the mass function we will still have to compare the
observations with a rather idealised model.

Figure 9. The ratio of observed HI column density
($\Sigma_{obs}$) to critical HI column density ($\Sigma_{crit}$).

In this section we describe an initial, rather simple, approach
to doing this, both with regard to interpreting the
observational data and modelling the results. The extensive
HIPASS data warrants a much more detailed analysis of its
noise properties. We will assume that the noise is Gaussian.
To a good approximation this is true (de Blok, in preparation),
but even small deviations from Gaussian noise may
alter our conclusions considerably. A simple model for
comparison with the observations is described below.

The volume $V$ (in Mpc$^3$) that a mass of hydrogen $M_{HI}$
can be detected in is

$$V = \frac{1}{12}\pi\theta^2 \left[\frac{M_{HI}}{2.4\times10^{5}\int S_v\,dv}\right]^{3/2} \qquad (9)$$

where $\theta$ (radians) is the angular diameter of the beam and
the flux integral corresponds to our survey limits. If the HI
mass function has a power law form



$$N(M_{HI})\,dM_{HI} = \phi \left(\frac{M_{HI}}{M_*}\right)^{\alpha} dM_{HI} \qquad (10)$$



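then the total number of detections to some upper mass
limit $M_{HI}^{max}$ is

$$N(M_{HI})_{Tot} \propto \int_{M_{HI}^{min}}^{M_{HI}^{lim}} M^{\alpha+3/2}\,dM + \left(M_{HI}^{lim}\right)^{3/2}\int_{M_{HI}^{lim}}^{M_{HI}^{max}} M^{\alpha}\,dM \qquad (11)$$

$M_{HI}^{lim}$ is the minimum detectable HI mass at a distance
$d_m$ and it can be calculated using equation 7. The first
integral gives the total number of galaxies with
$M_{HI} < M_{HI}^{lim}$ detected within a distance $d_m$, while the
second gives the number with $M_{HI} > M_{HI}^{lim}$ within the same
distance $d_m$. Assuming that
$M_{HI}^{min} \ll M_{HI}^{lim} \ll M_{HI}^{max}$ and $-2.5 < \alpha < -1$,

$$N(M_{HI})_{Tot} \propto \left(M_{HI}^{lim}\right)^{\alpha+5/2} \qquad (12)$$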



so that a plot of the log of the total numbers out to a given
distance (velocity) against the log of the distance (velocity)
will have slope $2\alpha + 5$ because $M_{HI}^{lim} \propto d_m^2$.
For example if $\alpha = -2$ then it should have a slope of 1. If
we assume that our initial 2435 sight lines are random then we
can use them to carry out the above test by comparing the data
with our model. Thus we can use all 155 detections rather than
just those with confirmed optical identifications.
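
The slope argument above can be checked numerically. The following is a minimal sketch, not the authors' simulation: it assumes a pure power law mass function (no Schechter cut-off), drops all constant factors, and uses hypothetical values for the flux limit and the mass range. It integrates $M^{\alpha}$ weighted by the detection volume, which grows as $M^{3/2}$ up to the limiting mass and is capped at the survey volume beyond it, and then measures the logarithmic slope of counts against distance.

```python
import math


def limiting_mass(d_mpc, flux_limit):
    """Minimum detectable HI mass (M_sun) at distance d: 2.4e5 * d^2 * integral(S dv)."""
    return 2.4e5 * d_mpc ** 2 * flux_limit


def total_detections(d_max_mpc, alpha, flux_limit=1.0,
                     m_min=1.0e3, m_max=1.0e13, n_steps=4000):
    """Relative number of detections within d_max for N(M) ~ M^alpha.

    flux_limit (Jy km/s), m_min and m_max are hypothetical illustration values.
    """
    m_lim = limiting_mass(d_max_mpc, flux_limit)
    ln_lo, ln_hi = math.log(m_min), math.log(m_max)
    h = (ln_hi - ln_lo) / n_steps
    total = 0.0
    for k in range(n_steps + 1):
        m = math.exp(ln_lo + k * h)
        # Volume factor: M^(3/2) below the limiting mass, capped above it.
        # The trailing factor m converts dM into d(ln M).
        integrand = m ** alpha * min(m, m_lim) ** 1.5 * m
        weight = 0.5 if k in (0, n_steps) else 1.0  # trapezoidal rule
        total += weight * integrand * h
    return total


def log_slope(alpha, d1=10.0, d2=40.0):
    """Slope of log(counts) against log(distance) between d1 and d2 (Mpc)."""
    n1 = total_detections(d1, alpha)
    n2 = total_detections(d2, alpha)
    return math.log(n2 / n1) / math.log(d2 / d1)
```

Under these assumptions `log_slope(-2.0)` comes out close to 1 and `log_slope(-1.5)` close to 2, in line with the $2\alpha + 5$ scaling.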

We have done this after slightly modifying the model
mass function form to that of the well known Schechter
function and then solving the above integrals numerically.
We have used $M_* = 10^{10} M_\odot$ (Zwaan et al. 1997) in all
of the simulations, changing only the value of the faint-end
slope $\alpha$. The result of comparing the distance distribution
of our initial 155 detections with various models, each with
different faint-end slopes, is shown in figure 10. Each model
has been normalised to the value of the data at a velocity of
1000 km s$^{-1}$. We have also indicated on figure 10 a velocity
of 5500 km s$^{-1}$, the velocity below which we believe we can
optically identify galaxies reliably. The different models are
still well separated at velocities below this.

For comparison we have plotted the prediction of
Zwaan et al. (1997), who derived the HI mass function from
an HI survey using the Arecibo telescope. We infer a faint-end
slope much steeper than they derive, though the Zwaan et
al. value is typical of a number of other recent determinations
(Banks et al. 1999; Henning et al. 2000). In our sample
there are relatively too few detections at higher velocities (or
alternatively too many at low velocities) for a mass function
this flat. Our value is consistent, though, with the steep
faint-end slope derived by Schneider et al. (1998). Similarly
steep values for the mass function (dark matter) are also
predicted by models of hierarchical structure formation
(Frenk et al. 199€; Kauffman et al. 1997).



Our inferred value for $\alpha$ could be wrong for a number of
reasons:

(i) The original optical detection was optimised for
nearby galaxies. The simulation assumes a uniform distribution
of galaxies, whereas the data are not uniform: they cover an
area of sky containing a number of nearby groups/clusters.
If we have selected more nearby galaxies than expected in a


Figure 10. The cumulative distribution of numbers detected
against velocity (horizontal axis: Log(velocity); panel label:
Random). Various models are shown with dashed lines,
the data as a solid line. The Zwaan et al. HI mass function has
$\alpha = -1.2$.



random sample then this would mimic a steep mass function
faint-end slope in our test. To check this we obtained a further
2500 spectra positioned randomly over the whole southern
sky and repeated the cross-correlation analysis. The result
was 82 detections. This is 73 fewer than in the previous
sample, confirming our previous view that a large number
of the original detections were not associated with the optical
galaxies. The random sample predicts a very similar
faint-end slope ($\alpha \approx -2$) to that of the optical sample
(see figure 10). Thus the inferred steep faint-end slope does not
appear to be due to the original optical selection of the targets.
The distribution of velocities for the random sample is
quite different to that of the original optical sample. It does
not show any noticeable clustering features. This is not
surprising given the large area covered. The Fairall (1998) maps
show galaxies at pretty much all velocities if you consider
an area as large as this.

(ii) The noise is not Gaussian, so that the $4\sigma$ events
are noise spikes randomly distributed along the spectrum.
This is difficult to rule out, but if they were uniformly
distributed along the spectrum then the cumulative number
detected would just be proportional to $d_m$, so we would infer
a value of $\alpha = -1.25$, much flatter than we have measured.
If the noise is not uniformly distributed it is more difficult
to model, but we can try to assess the probability that
these $4\sigma$ fluctuations occur by chance, given the combination
of random and baseline fluctuations. If the noise was
purely Gaussian then our minimum detection corresponds
approximately to one pixel at $4\sigma$ and two at $2\sigma$. The
probability of this happening by chance is about $10^{-8}$. Given
that we have a little over $10^6$ pixels in the random data set we
would not expect any detections by chance. On the basis
of Gaussian statistics 82 detections is a highly significant
result. We have tried to quantify the effect the baseline
fluctuations may have on this conclusion by median filtering
the spectra (filter width of ~500 km s$^{-1}$) to try to remove
any 'real' detections. This should then leave us with just the
baseline fluctuations. We have then added Gaussian noise to
this with a standard deviation the same as that measured
using a pre-filtered baseline subtracted spectrum. This is
illustrated in figure 11. To quantify the uncertainty from
baseline noise we repeated the cross-correlation of the random
sight-lines using the simulated smoothed and Gaussian noise
added spectra. There was only one detection. Thus 82 detections
is still highly significant compared to our expectation
from the 'model' spectra.

(iii) At first sight one of the most convincing pieces of
evidence that these detections are real is the distribution of
velocities of the original 155 detections from the HI/optical
sample. We have already shown that the secure optical
detections cluster in velocity in the same way as the known
large scale structure over this same area of sky (fig. 5).
Figure 12 shows that the $4\sigma$ HI detections from the original
HI/optical sample cluster, in velocity, in just the same way
as optically selected galaxies (Jones & Jones 198$; Fairall
1998). There are strong peaks at ~1500 and 4500 km s$^{-1}$
and 105 out of 155 HI detections have velocities less than
5500 km s$^{-1}$. In addition the smallest velocity width
detections ($\Delta v < 50$ km s$^{-1}$), the ones most likely to
be noise, also cluster in the same way (dotted lines fig. 12).
The problem is that nearby bright galaxies can 'appear'
in beams at some considerable distance away (see the case
of NGC1291 and 1365 described in section 3). Given that
the region observed contains many nearby groups/clusters
this may be a problem, though as described in section 3 it
is not obvious in any of these cases which galaxy might be
the cause of the signal. The cumulative effect of the surface
density of nearby bright galaxies on the interpretation of
random sight line data requires further investigation. This is
something we shall be pursuing in the future.
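
The chance probability quoted in item (ii) can be reproduced to order of magnitude with a few lines. This is a sketch under stated assumptions rather than the authors' calculation: it treats the pixels as independent, uses one-sided Gaussian tails, and takes the minimum detection to be one pixel above $4\sigma$ together with two pixels above $2\sigma$. The function names are illustrative.

```python
import math


def gaussian_tail(n_sigma):
    """One-sided probability that a Gaussian deviate exceeds n_sigma."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))


def chance_probability():
    """Probability of one pixel > 4 sigma and two pixels > 2 sigma (independent)."""
    return gaussian_tail(4.0) * gaussian_tail(2.0) ** 2


def expected_false_detections(n_pixels):
    """Expected number of chance detections among n_pixels trial positions."""
    return n_pixels * chance_probability()
```

With these assumptions `chance_probability()` is of order $10^{-8}$, so `expected_false_detections(1.0e6)` is well below one, consistent with the statement that no chance detections are expected for purely Gaussian noise.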

How do these detections compare with other observations
at 21cm? Recently Kilborn (2001) has used a large
subset of the HIPASS data to derive the HI mass function.
She surveyed $\approx 10^6$ Mpc$^3$ and detected 533 galaxies.
Scaling this by the $\sim 6 \times 10^4$ Mpc$^3$ surveyed by our
2500 random beams we would expect about 32 galaxies in our
sample. We actually detected more than twice this number using
our automated technique. Looking at this a slightly different
way, Kilborn also lists the parameters of the derived mass
function. Using these data we would predict there to be about 20
$M_*$ ($\approx 10^{10} M_\odot$) galaxies in our random beams.
There are actually 22 galaxies with $M_{HI}$ between
$5 \times 10^9$ and $5 \times 10^{10}$ $M_\odot$ in the random
sample. The two data sets are consistent for $M_*$ galaxies.
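
The volume scaling used in the comparison with Kilborn (2001) is simple arithmetic and can be written out as a short check. The function name is illustrative; the input numbers are the ones quoted in the text.

```python
def expected_detections(n_reference, volume_reference_mpc3, volume_ours_mpc3):
    """Scale a detection count from a reference survey volume to another volume."""
    return n_reference * volume_ours_mpc3 / volume_reference_mpc3


# 533 galaxies in ~1e6 Mpc^3, scaled to the ~6e4 Mpc^3 of the 2500 random beams
expected = expected_detections(533, 1.0e6, 6.0e4)
# Ratio of the 82 detections in the random sample to that expectation
excess = 82 / expected
```

This gives an expectation of about 32 galaxies, so the 82 random-sample detections exceed it by a factor of roughly 2.6.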

These results are also not inconsistent with quasar
absorption line statistics. According to Rao and Briggs (1993)
we should expect ~10 damped Ly$\alpha$ lines in 1000 sight lines
to quasars at z = 0.65 ($q_0$ = 0.5). If we treat our 2500 sight
lines as pencil beams (to give a lower limit) to ~12000 km
s$^{-1}$ then we would expect 14 damped Ly$\alpha$ systems in our
random sight lines. As the survey column density limit is
~$10^{19}$ atoms cm$^{-2}$, an order of magnitude less than that
required for a damped Ly$\alpha$ system, we would also expect to
find approximately $14 \times (10)^{1.67} \approx 650$ Lyman limit systems
Figure 11. An example of an HI spectrum after median smoothing
and the addition of random noise. The lower spectrum is
before smoothing, the middle spectrum is the result of median
smoothing and the upper spectrum has had random noise added.
The middle and upper spectra have been shifted vertically
by 0.04 and 0.08 respectively for clarity.

Figure 12. The distribution of HI velocities from the complete
HI/optical sample (solid lines). The dotted line is for detections
with $\Delta v < 50$ km s$^{-1}$.
(Rao & Briggs 1993). Using the numbers against redshift
relation of Stengler-Larrea et al. (1995), which includes
Lyman limit system evolution, we come to a similar number
(~300). In fact we have found fewer than we might have
expected from quasar absorption line statistics. This highlights
a discrepancy between 21cm and QSO absorber observations.
Either the column density dependence of the frequency of
occurrence of quasar absorption lines is strongly evolving or
most of the lines fall below our survey limits. Typically
Lyman limit absorption lines have measured velocity widths
of 10-30 km s$^{-1}$, but of course the line-of-sight only passes
through a small part of the object. If the Lyman limit
systems are pressure supported then 10-30 km s$^{-1}$ may well be
the typical velocity dispersion of the gas and we would only
detect those with the largest line widths. If they are
rotationally supported then perhaps we are only detecting those
that are sufficiently face-on to have high enough central
intensities to be detected. In either case these observations are
not inconsistent with absorption line studies of quasars. 39
of the 82 detections in the random sample have line widths
of 50 km s$^{-1}$ or less.
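
The Lyman limit estimate quoted above scales the 14 expected damped Ly$\alpha$ systems by a power of the column density ratio. A minimal sketch of that step follows; the function name is illustrative, and the index 1.67 and the factor of 10 in column density are the values given in the text.

```python
def scaled_absorber_count(n_damped, column_density_ratio, index=1.67):
    """Scale an absorber count by (column density ratio)^index."""
    return n_damped * column_density_ratio ** index


# 14 damped systems, survey limit an order of magnitude lower in column density
n_lyman_limit = scaled_absorber_count(14, 10.0)
```

The result is about 655, which the text rounds to 650.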

In the same way as we tried to find corresponding optical
and HI detections for the LSB sample, we have also
tried to identify optical counterparts of the random HI
detections. As described before, this is very difficult for distant
sources, but for nearby objects we might hope that they are
both larger and brighter. There is a clear distinction between
the optical identification of small (less than 50 km
s$^{-1}$) and large velocity width objects. Nearby large velocity
width objects are invariably associated with a bright galaxy,
the $M_*$ galaxies that we might have expected to detect (see
above), while small velocity width objects are associated with
apparently blank fields on the DSS.



9 CONCLUSIONS 

Our initial intention was to try and find massive HI galaxies, 
like Malin 1, by looking at the HI properties of a sample of 
LSB galaxies. No objects as extreme as Malin 1, with regard 
to total HI mass, have been found. We have detected a
population of extremely gas rich galaxies. These galaxies all have
masses within the range of previously well studied galaxies.
One striking feature is the very high HI mass compared to 
stellar mass. These galaxies are either still accumulating gas, 
they are young and/or they are forming stars at a very slow 
rate. Detailed (interferometric) observations are required to 
investigate these alternatives further. The detected galaxies 
reside in regions of low galactic densities where presumably 
they have had little interaction with other galaxies. If galax- 
ies had first been detected in HI, rather than the optical, 
then we would have inferred a very different spatial distri- 
bution on the sky, one that was full of voids rather than 
clusters. Thus HI surveys not only select objects with
extraordinary HI properties, they also define a very different
large scale structure. How extensive this HI rich population
is, is still an open question that warrants a much more de- 
tailed analysis of the HIPASS data. 